Search Results: Records 1-3 displayed on this page of 3

Oral presentation

Implementation and evaluation of a communication avoiding Krylov subspace method, CA-GMRES, on HA-PACS/TCA

Matsumoto, Kazuya*; Idomura, Yasuhiro; Ina, Takuya*; Mayumi, Akie; Yamada, Susumu

no journal

Communication-avoiding (CA) Krylov methods are promising solutions for communication bottlenecks on supercomputers based on many-core processors or accelerators. In this work, we implemented the CA-GMRES method on a GPU cluster, the HA-PACS, and evaluated its performance on a non-symmetric matrix solver from a nuclear CFD code. The results show that the CA-GMRES method is significantly faster than conventional Krylov methods such as the GMRES method and the GCR method.

Oral presentation

Targeting exa-scale systems; Performance portability and scalable data analysis

Asahi, Yuichi; Maeyama, Shinya*; Bigot, J.*; Garbet, X.*; Grandgirard, V.*; Obrejan, K.*; Padioleau, T.*; Fujii, Keisuke*; Shimokawabe, Takashi*; Watanabe, Tomohiko*; et al.

no journal

We will demonstrate a performance-portable implementation of a kinetic plasma code across CPUs and Nvidia and AMD GPUs. We will also discuss the performance portability of the code with C++ parallel algorithms. Deep-learning-based surrogate models for fluid simulations will also be demonstrated.

Oral presentation

Targeting exa-scale systems; Performance portability and scalable data analysis

Asahi, Yuichi; Maeyama, Shinya*; Bigot, J.*; Garbet, X.*; Grandgirard, V.*; Obrejan, K.*; Padioleau, T.*; Fujii, Keisuke*; Shimokawabe, Takashi*; Watanabe, Tomohiko*; et al.

no journal

We will demonstrate a performance-portable implementation of a kinetic plasma code across CPUs and Nvidia and AMD GPUs. We will also discuss the performance portability of the code with C++ parallel algorithms. Deep-learning-based surrogate models for fluid simulations will also be demonstrated.
